Financial Time Series Models

Due to the intense volatility often present in financial time series data, specialized modeling techniques are necessary to capture unusual behavior and make informed forecasts. This page will therefore visualize all nine financial time series (stock prices of companies in related industries) to give us a better idea of the economic landscape surrounding people’s transportation choices and the fallout of pandemic-induced phenomena. We will then select one time series per industry as a representative, to avoid over-complicating our analysis with stocks that are heavily influenced by factors beyond those that impact public transit. As a reminder, these are the stock prices we will be looking at, from the start of 2020 to the present day:

Rideshare Companies:

Telecommunications Companies:

Oil Companies:

Note: Much of the code and process on this page is re-purposed from:

Gamage, P. (2026). Applied Time Series for Data Science.

Code
library(tidyverse)
library(ggplot2)
library(forecast)
library(astsa) 
library(xts)
library(tseries)
library(fpp2)
library(fma)
library(lubridate)
library(TSstudio)
library(quantmod)
library(tidyquant)
library(plotly)
library(vars)
library(gridExtra)
Code
uber <- read_csv("../data/uber.csv")
lyft <- read_csv("../data/lyft.csv")
zoom <- read_csv("../data/zoom.csv")
microsoft <- read_csv("../data/microsoft.csv")
cisco <- read_csv("../data/cisco.csv")
bp <- read_csv("../data/bp.csv")
shell <- read_csv("../data/shell.csv")
exxon <- read_csv("../data/exxon.csv")
chevron <- read_csv("../data/chevron.csv")
rideshare <- read_csv("../data/rideshare.csv")
telecom <- read_csv("../data/telecom.csv")
oil <- read_csv("../data/oil.csv")

Visualizing the data

In this section, we will take a look at the returns of each financial time series to comment on both the stationarity and the volatility of the data. All of these series are stock prices of companies in industries related to public transit usage, and returns are calculated from adjusted closing prices.
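As a reference for how the returns below are computed, quantmod's Delt() produces arithmetic returns, \((P_t - P_{t-1})/P_{t-1}\). A minimal sketch with made-up prices:

```r
# Arithmetic daily returns from adjusted closes: (P_t - P_{t-1}) / P_{t-1}.
# quantmod::Delt() computes the same quantity (with an NA in the first slot,
# which is why the first row is dropped in the chunks below).
prices  <- c(100, 102, 99, 99)               # made-up adjusted closing prices
returns <- diff(prices) / head(prices, -1)
round(returns, 4)                            # 0.0200 -0.0294 0.0000
```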

Code
plot <- plot_ly(rideshare, x = ~Date) %>%
  add_lines(y = ~Uber, name = "Uber Closing Price", line = list(color = 'black')) %>%
  add_lines(y = ~Lyft, name = "Lyft Closing Price", line = list(color = '#cc00bb')) %>%
  layout(
    title = "Rideshare Stock Prices",
    xaxis = list(title = "Date"),
    yaxis = list(title = "Price"),
    legend = list(title = list(text = "Legend"))
  )

plot
Code
uber_returns <- Delt(uber$UBER.Adjusted)
uber_returns_df <- data.frame(
  date = uber$date,
  return = uber_returns
)
uber_returns_df <- uber_returns_df[-1,]
fig <- plot_ly(uber_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "Uber Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig
Code
lyft_returns <- Delt(lyft$LYFT.Adjusted)
lyft_returns_df <- data.frame(
  date = lyft$date,
  return = lyft_returns
)
lyft_returns_df <- lyft_returns_df[-1,]
fig <- plot_ly(lyft_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "Lyft Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig

These plots show that Uber and Lyft stock prices follow similar patterns until mid-2022, when they diverge greatly. Both show high volatility in 2020, with further spikes in 2022 and 2024, indicating that these stocks are susceptible to outside events. Neither series appears to be stationary. As decided in the EDA section, we will continue analyzing the rideshare industry by fitting models to the Uber stock only.

Code
plot <- plot_ly(telecom, x = ~Date) %>%
  add_lines(y = ~Zoom, name = "Zoom Closing Price", line = list(color = 'blue')) %>%
  add_lines(y = ~Microsoft, name = "Microsoft Closing Price", line = list(color = '#008800')) %>%
  add_lines(y = ~Cisco, name = "Cisco Closing Price", line = list(color = '#bb0000')) %>%
  layout(
    title = "Telecommunications Stock Prices",
    xaxis = list(title = "Date"),
    yaxis = list(title = "Price"),
    legend = list(title = list(text = "Legend"))
  )

plot
Code
microsoft_returns <- Delt(microsoft$MSFT.Adjusted)
microsoft_returns_df <- data.frame(
  date = microsoft$date,
  return = microsoft_returns
)
microsoft_returns_df <- microsoft_returns_df[-1,]
fig <- plot_ly(microsoft_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "Microsoft Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig
Code
zoom_returns <- Delt(zoom$ZM.Adjusted)
zoom_returns_df <- data.frame(
  date = zoom$date,
  return = zoom_returns
)
zoom_returns_df <- zoom_returns_df[-1,]
fig <- plot_ly(zoom_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "Zoom Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig
Code
cisco_returns <- Delt(cisco$CSCO.Adjusted)
cisco_returns_df <- data.frame(
  date = cisco$date,
  return = cisco_returns
)
cisco_returns_df <- cisco_returns_df[-1,]
fig <- plot_ly(cisco_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "Cisco Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig

These stocks differ greatly from one another. Cisco appears closest to stationary, with a relatively steady price that is less affected by the events of the last five years, while the others are clearly non-stationary. All three show great volatility in 2020, and each has spikes at different times that do not appear to coincide with the others. Overall, there is more volatility here than among the rideshare stocks. As decided in the EDA section, we will continue analyzing the telecommunications industry by fitting models to the Zoom stock only.

Code
plot <- plot_ly(oil, x = ~Date) %>%
  add_lines(y = ~BP, name = "BP Closing Price", line = list(color = '#66cc00')) %>%
  add_lines(y = ~Shell, name = "Shell Closing Price", line = list(color = '#cccc00')) %>%
  add_lines(y = ~Exxon, name = "Exxon Closing Price", line = list(color = '#bb0000')) %>%
  add_lines(y = ~Chevron, name = "Chevron Closing Price", line = list(color = '#0000bb')) %>%
  layout(
    title = "Oil Stock Prices",
    xaxis = list(title = "Date"),
    yaxis = list(title = "Price"),
    legend = list(title = list(text = "Legend"))
  )

plot
Code
shell_returns <- Delt(shell$SHEL.Adjusted)
shell_returns_df <- data.frame(
  date = shell$date,
  return = shell_returns
)
shell_returns_df <- shell_returns_df[-1,]
fig <- plot_ly(shell_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "Shell Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig
Code
bp_returns <- Delt(bp$BP.Adjusted)
bp_returns_df <- data.frame(
  date = bp$date,
  return = bp_returns
)
bp_returns_df <- bp_returns_df[-1,]
fig <- plot_ly(bp_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "BP Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig
Code
exxon_returns <- Delt(exxon$XOM.Adjusted)
exxon_returns_df <- data.frame(
  date = exxon$date,
  return = exxon_returns
)
exxon_returns_df <- exxon_returns_df[-1,]
fig <- plot_ly(exxon_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "Exxon Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig
Code
chevron_returns <- Delt(chevron$CVX.Adjusted)
chevron_returns_df <- data.frame(
  date = chevron$date,
  return = chevron_returns
)
chevron_returns_df <- chevron_returns_df[-1,]
fig <- plot_ly(chevron_returns_df, x = ~date, y = ~Delt.1.arithmetic, type = 'scatter', mode = 'lines') %>%
  layout(title = "Chevron Daily Returns",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Daily Return (Calculated by Adjusted Price)"))
fig

Each of these prices follows a very similar trend over the last five years, which speaks to how closely companies in the oil industry move together. The industry appears somewhat less volatile than the others, although there was certainly erratic movement in 2020. None of these time series appears to be stationary. As decided in the EDA section, we will continue analyzing the oil industry by fitting models to the Exxon stock only.
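The visual stationarity judgments above can also be checked with a unit-root test. Here is a sketch on simulated data (not this page's stock series), using base R's stats::PP.test (Phillips-Perron) as a stand-in:

```r
# Unit-root sketch: raw prices behave like a random walk (non-stationary),
# while their differences (returns) are stationary. Simulated data only.
set.seed(3)
rw  <- cumsum(rnorm(400))              # random-walk "price" series
ret <- diff(rw)                        # "returns": white noise here
p_price  <- PP.test(ts(rw))$p.value    # typically large: unit root not rejected
p_return <- PP.test(ts(ret))$p.value   # small: unit root firmly rejected
```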

Model Fitting

First, we will take a look at the ACF and PACF plots of the returns, as well as the ACF plots of the absolute and squared returns, to see what is needed prior to fitting a model.

Code
uber_acf <- ggAcf(uber_returns_df$Delt.1.arithmetic)
uber_pacf <- ggPacf(uber_returns_df$Delt.1.arithmetic)
uber_abs_acf <- ggAcf(abs(uber_returns_df$Delt.1.arithmetic))
uber_sq_acf <- ggAcf((uber_returns_df$Delt.1.arithmetic)^2)
grid.arrange(uber_acf, uber_pacf, uber_abs_acf, uber_sq_acf, nrow=4)

Based on the absolute and squared returns, there appears to be autocorrelation at several lags, even if it is not as visible in the raw returns. Therefore, a GARCH model is appropriate, and the following steps fit both GARCH and ARIMA+GARCH models to see which performs better. \(p\) and \(q\) will be chosen via a grid search over AIC.
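For context on what these fits estimate, a GARCH(1,1) models the conditional variance as \(\sigma_t^2 = a_0 + a_1\varepsilon_{t-1}^2 + b_1\sigma_{t-1}^2\). The following is an illustrative simulation with made-up parameters (not values fitted on this page), showing the volatility clustering that motivates using GARCH here:

```r
# GARCH(1,1) simulation with made-up parameters; illustrative only.
set.seed(42)
n  <- 500
a0 <- 1e-5; a1 <- 0.10; b1 <- 0.85     # a1 + b1 < 1 keeps variance finite
eps  <- numeric(n)                     # simulated "returns"
sig2 <- numeric(n)                     # conditional variances
sig2[1] <- a0 / (1 - a1 - b1)          # start at the unconditional variance
eps[1]  <- sqrt(sig2[1]) * rnorm(1)
for (t in 2:n) {
  sig2[t] <- a0 + a1 * eps[t - 1]^2 + b1 * sig2[t - 1]
  eps[t]  <- sqrt(sig2[t]) * rnorm(1)
}
# The squared series carries the autocorrelation the ACF plots above look for:
lag1_sq <- acf(eps^2, plot = FALSE)$acf[2]
```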

Code
uber_arch_models <- list()
cc <- 1

for (p in 1:7) {
  for (q in 1:7) {
    uber_arch_models[[cc]] <- garch((uber_returns_df$Delt.1.arithmetic)^2, order = c(q, p), trace = FALSE)
    cc <- cc + 1
  }
}

uber_ARCH_AIC <- sapply(uber_arch_models, AIC)
uber_best_index <- which.min(uber_ARCH_AIC)
print(uber_arch_models[[uber_best_index]])

Call:
garch(x = (uber_returns_df$Delt.1.arithmetic)^2, order = c(q,     p), trace = FALSE)

Coefficient(s):
       a0         a1         b1  
1.949e-06  1.428e+00  2.913e-01  
Code
summary(uber_arch_models[[uber_best_index]])

Call:
garch(x = (uber_returns_df$Delt.1.arithmetic)^2, order = c(q,     p), trace = FALSE)

Model:
GARCH(1,1)

Residuals:
     Min       1Q   Median       3Q      Max 
 0.00000  0.02639  0.13479  0.39806 12.68623 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 1.949e-06   3.854e-08    50.58   <2e-16 ***
a1 1.428e+00   3.143e-02    45.44   <2e-16 ***
b1 2.913e-01   9.292e-03    31.35   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 186951, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 0.18625, df = 1, p-value = 0.6661
Code
uber_best_ARCH_fit <- fGarch::garchFit(
  formula = ~ garch(1, 2), 
  data = (uber_returns_df$Delt.1.arithmetic)^2, 
  trace = FALSE, 
  cond.dist = "norm"
)

summary(uber_best_ARCH_fit)

Title:
 GARCH Modelling 

Call:
 fGarch::garchFit(formula = ~garch(1, 2), data = (uber_returns_df$Delt.1.arithmetic)^2, 
    cond.dist = "norm", trace = FALSE) 

Mean and Variance Equation:
 data ~ garch(1, 2)
<environment: 0x7f8a8fb59e18>
 [data = (uber_returns_df$Delt.1.arithmetic)^2]

Conditional Distribution:
 norm 

Coefficient(s):
        mu       omega      alpha1       beta1       beta2  
6.1623e-04  2.0665e-06  1.0000e+00  3.0219e-01  1.0000e-08  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu     6.162e-04   6.613e-05    9.318  < 2e-16 ***
omega  2.066e-06   5.782e-07    3.574 0.000352 ***
alpha1 1.000e+00   1.826e-01    5.477 4.33e-08 ***
beta1  3.022e-01   3.477e-01    0.869 0.384799    
beta2  1.000e-08   2.158e-01    0.000 1.000000    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 6259.749    normalized:  4.667971 

Description:
 Tue May  6 21:46:39 2025 by user:  


Standardised Residuals Tests:
                                   Statistic     p-Value
 Jarque-Bera Test   R    Chi^2  2.051973e+05 0.000000000
 Shapiro-Wilk Test  R    W      4.427170e-01 0.000000000
 Ljung-Box Test     R    Q(10)  1.711598e+01 0.071836759
 Ljung-Box Test     R    Q(15)  3.342570e+01 0.004096922
 Ljung-Box Test     R    Q(20)  4.202570e+01 0.002744335
 Ljung-Box Test     R^2  Q(10)  8.599375e-01 0.999914255
 Ljung-Box Test     R^2  Q(15)  9.541431e-01 0.999999818
 Ljung-Box Test     R^2  Q(20)  1.386222e+00 0.999999996
 LM Arch Test       R    TR^2   9.656504e-01 0.999988346

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-9.328485 -9.309092 -9.328512 -9.321220 

The intercept and ARCH coefficients of this GARCH(1,2) model are statistically significant, though the two GARCH (beta) terms are not, meaning the conditional variance is driven mainly by past squared returns. Prior to fitting an ARIMA+GARCH model, we must find optimal ARIMA parameters, cross-checking a grid search against auto.arima().

Code
uber_returns <- na.omit(uber_returns)
uber_ts <- ts(uber_returns, start=decimal_date(as.Date("2020-01-01")), frequency = 252)
Code
ARIMA.c <- function(p_min, p_max, q_min, q_max, data) {
  uber_results <- matrix(NA, nrow = 0, ncol = 6)
  colnames(uber_results) <- c("p", "d", "q", "AIC", "BIC", "AICc")

  for (p in p_min:p_max) {
    for (q in q_min:q_max) {
      for (d in 0:2) {
        if ((p + d + q) <= 6) {
          fit <- Arima(data, order = c(p, d, q))
          metrics <- c(p, d, q, fit$aic, fit$bic, fit$aicc)
          uber_results <- rbind(uber_results, metrics)
        }
      }
    }
  }

  # Convert to data frame
  uber_results_df <- as.data.frame(uber_results)
  colnames(uber_results_df) <- c("p", "d", "q", "AIC", "BIC", "AICc")
  return(uber_results_df)
}

uber_output <- ARIMA.c(0, 7, 0, 7, data = uber_ts)
uber_output[which.min(uber_output$AIC), ]
           p d q       AIC       BIC      AICc
metrics.48 3 0 1 -5259.465 -5228.258 -5259.402
Code
auto.arima(uber_ts)
Series: uber_ts 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
        mean
      0.0013
s.e.  0.0009

sigma^2 = 0.001161:  log likelihood = 2629.1
AIC=-5254.2   AICc=-5254.19   BIC=-5243.8

Since auto.arima() returns an ARIMA(0,0,0) model, automatic selection offers little here, so we instead use a low-AIC model from the grid search: an ARIMA(2,0,3), whose AIC (-5259.2) is essentially tied with the grid-search minimum of -5259.465 for ARIMA(3,0,1). The following results are diagnostics from this model.
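The grid search above operationalizes a standard AIC comparison; the same logic can be sketched with base R's stats::arima (standing in for forecast::Arima) on simulated data with hypothetical orders:

```r
# AIC-based order selection sketch; data simulated from a known AR(2), so the
# matching order should beat a white-noise baseline.
set.seed(7)
y <- arima.sim(model = list(ar = c(0.5, -0.2)), n = 300)
fit_wn  <- arima(y, order = c(0, 0, 0))        # white-noise baseline
fit_ar2 <- arima(y, order = c(2, 0, 0))        # matches the true order
c(baseline = AIC(fit_wn), ar2 = AIC(fit_ar2))  # AR(2) fit has the lower AIC
```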

Code
uber_arima <- Arima(uber_ts, order = c(2,0,3))
uber_arima_res <- residuals(uber_arima)
summary(uber_arima)
Series: uber_ts 
ARIMA(2,0,3) with non-zero mean 

Coefficients:
         ar1      ar2      ma1     ma2      ma3    mean
      1.2656  -0.3372  -1.2864  0.4374  -0.0987  0.0013
s.e.  0.2121   0.2366   0.2112  0.2391   0.0292  0.0007

sigma^2 = 0.001153:  log likelihood = 2636.6
AIC=-5259.2   AICc=-5259.12   BIC=-5222.79

Training set error measures:
                       ME       RMSE       MAE MPE MAPE      MASE         ACF1
Training set 3.196353e-05 0.03387395 0.0234167 NaN  Inf 0.6908066 0.0006858808
Code
uber_res_acf <- ggAcf(uber_arima_res, lag.max = 30)
uber_res_pacf <- ggPacf(uber_arima_res, lag.max = 30)
uber_res_abs_acf <- ggAcf(abs(uber_arima_res), lag.max = 30)
uber_res_sq_acf <- ggAcf((uber_arima_res)^2, lag.max = 30)
grid.arrange(uber_res_acf, uber_res_pacf, uber_res_abs_acf, uber_res_sq_acf, nrow=4)

Here we still see some large ACF values in the absolute and squared residuals, so we will fit GARCH models to the ARIMA residuals and compare diagnostics to determine which model is best.

Code
uber_arima_arch_models <- list()
cc <- 1

for (p in 1:7) {
  for (q in 1:7) {
    uber_arima_arch_models[[cc]] <- garch(uber_arima_res, order = c(q, p), trace = FALSE)
    cc <- cc + 1
  }
}

uber_ARIMA_ARCH_AIC <- sapply(uber_arima_arch_models, AIC)
uber_best_arima_index <- which.min(uber_ARIMA_ARCH_AIC)
print(uber_arima_arch_models[[uber_best_arima_index]])

Call:
garch(x = uber_arima_res, order = c(q, p), trace = FALSE)

Coefficient(s):
       a0         a1         b1  
0.0001267  0.1405096  0.7437665  
Code
summary(uber_arima_arch_models[[uber_best_arima_index]])

Call:
garch(x = uber_arima_res, order = c(q, p), trace = FALSE)

Model:
GARCH(1,1)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.87441 -0.58087 -0.02373  0.54840  6.03656 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 1.267e-04   2.185e-05    5.797 6.76e-09 ***
a1 1.405e-01   1.870e-02    7.515 5.68e-14 ***
b1 7.438e-01   3.497e-02   21.267  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 689.74, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 0.017738, df = 1, p-value = 0.894
Code
uber_best_ARIMA_ARCH_fit <- fGarch::garchFit(
  formula = ~ garch(1, 1), 
  data = uber_arima_res, 
  trace = FALSE, 
  cond.dist = "norm"
)

summary(uber_best_ARIMA_ARCH_fit)

Title:
 GARCH Modelling 

Call:
 fGarch::garchFit(formula = ~garch(1, 1), data = uber_arima_res, 
    cond.dist = "norm", trace = FALSE) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x7f8a47c0c588>
 [data = uber_arima_res]

Conditional Distribution:
 norm 

Coefficient(s):
        mu       omega      alpha1       beta1  
0.00031964  0.00012767  0.14125731  0.74191292  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu     3.196e-04   7.949e-04    0.402 0.687587    
omega  1.277e-04   4.581e-05    2.787 0.005325 ** 
alpha1 1.413e-01   3.935e-02    3.590 0.000331 ***
beta1  7.419e-01   7.420e-02    9.999  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 2756.02    normalized:  2.055198 

Description:
 Tue May  6 21:47:13 2025 by user:  


Standardised Residuals Tests:
                                 Statistic   p-Value
 Jarque-Bera Test   R    Chi^2  692.817441 0.0000000
 Shapiro-Wilk Test  R    W        0.965332 0.0000000
 Ljung-Box Test     R    Q(10)    3.163394 0.9773130
 Ljung-Box Test     R    Q(15)   12.844822 0.6142825
 Ljung-Box Test     R    Q(20)   23.423464 0.2684829
 Ljung-Box Test     R^2  Q(10)    2.206664 0.9944967
 Ljung-Box Test     R^2  Q(15)    6.986083 0.9580348
 Ljung-Box Test     R^2  Q(20)    8.589896 0.9871889
 LM Arch Test       R    TR^2     2.689235 0.9973686

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-4.104430 -4.088916 -4.104448 -4.098618 

This model has the higher p-value from the Box-Ljung test on the squared residuals (0.894), and every variance-equation coefficient (omega, alpha1, beta1) is significant. That large p-value indicates no significant remaining autocorrelation in the squared residuals, meaning that modeling the ARIMA residuals proved useful.
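The Box-Ljung logic used for these diagnostics can be sketched on synthetic series: residuals that are white noise should give a large p-value, while autocorrelated residuals should not (illustrative data, not the fitted residuals above):

```r
# Ljung-Box sanity check on simulated series (not this page's residuals).
set.seed(1)
wn  <- rnorm(500)                                  # white noise
ar1 <- arima.sim(model = list(ar = 0.8), n = 500)  # strongly autocorrelated
p_wn  <- Box.test(wn,  lag = 10, type = "Ljung-Box")$p.value
p_ar1 <- Box.test(ar1, lag = 10, type = "Ljung-Box")$p.value
# p_ar1 is effectively zero (reject white noise); p_wn should be much larger.
```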

Code
zoom_acf <- ggAcf(zoom_returns_df$Delt.1.arithmetic)
zoom_pacf <- ggPacf(zoom_returns_df$Delt.1.arithmetic)
zoom_abs_acf <- ggAcf(abs(zoom_returns_df$Delt.1.arithmetic))
zoom_sq_acf <- ggAcf((zoom_returns_df$Delt.1.arithmetic)^2)
grid.arrange(zoom_acf, zoom_pacf, zoom_abs_acf, zoom_sq_acf, nrow=4)

Code
zoom_arch_models <- list()
cc <- 1

for (p in 1:7) {
  for (q in 1:7) {
    zoom_arch_models[[cc]] <- garch(zoom_returns_df$Delt.1.arithmetic, order = c(q, p), trace = FALSE)
    cc <- cc + 1
  }
}

zoom_ARCH_AIC <- sapply(zoom_arch_models, AIC)
zoom_best_index <- which.min(zoom_ARCH_AIC)
print(zoom_arch_models[[zoom_best_index]])

Call:
garch(x = zoom_returns_df$Delt.1.arithmetic, order = c(q, p),     trace = FALSE)

Coefficient(s):
       a0         a1         b1         b2  
2.345e-05  1.590e-01  3.681e-09  8.458e-01  
Code
summary(zoom_arch_models[[zoom_best_index]])

Call:
garch(x = zoom_returns_df$Delt.1.arithmetic, order = c(q, p),     trace = FALSE)

Model:
GARCH(2,1)

Residuals:
     Min       1Q   Median       3Q      Max 
-6.59060 -0.51884 -0.02349  0.54136 10.96197 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 2.345e-05   5.050e-06    4.642 3.45e-06 ***
a1 1.590e-01   1.119e-02   14.215  < 2e-16 ***
b1 3.681e-09   6.847e-03    0.000        1    
b2 8.458e-01   1.354e-02   62.466  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 12442, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 0.36637, df = 1, p-value = 0.545
Code
zoom_best_ARCH_fit <- fGarch::garchFit(
  formula = ~ garch(2, 1), 
  data = zoom_returns_df$Delt.1.arithmetic, 
  trace = FALSE, 
  cond.dist = "norm"
)

summary(zoom_best_ARCH_fit)

Title:
 GARCH Modelling 

Call:
 fGarch::garchFit(formula = ~garch(2, 1), data = zoom_returns_df$Delt.1.arithmetic, 
    cond.dist = "norm", trace = FALSE) 

Mean and Variance Equation:
 data ~ garch(2, 1)
<environment: 0x7f8aabba49b0>
 [data = zoom_returns_df$Delt.1.arithmetic]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1       alpha2        beta1  
-0.00046513   0.00004943   0.14661534   0.00000001   0.83143974  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -4.651e-04   7.377e-04   -0.630    0.528    
omega   4.943e-05         NaN      NaN      NaN    
alpha1  1.466e-01   2.128e-02    6.889  5.6e-12 ***
alpha2  1.000e-08         NaN      NaN      NaN    
beta1   8.314e-01         NaN      NaN      NaN    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 2702.637    normalized:  2.015389 

Description:
 Tue May  6 21:47:14 2025 by user:  


Standardised Residuals Tests:
                                   Statistic   p-Value
 Jarque-Bera Test   R    Chi^2  5020.5278002 0.0000000
 Shapiro-Wilk Test  R    W         0.9250622 0.0000000
 Ljung-Box Test     R    Q(10)    10.2111812 0.4221652
 Ljung-Box Test     R    Q(15)    15.6764673 0.4038735
 Ljung-Box Test     R    Q(20)    19.1373805 0.5129129
 Ljung-Box Test     R^2  Q(10)     5.6426543 0.8443395
 Ljung-Box Test     R^2  Q(15)     6.7425085 0.9644050
 Ljung-Box Test     R^2  Q(20)     9.6712697 0.9737486
 LM Arch Test       R    TR^2      5.9684340 0.9176649

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-4.023321 -4.003928 -4.023348 -4.016056 
Code
zoom_returns <- na.omit(zoom_returns)
zoom_ts <- ts(zoom_returns, start=decimal_date(as.Date("2020-01-01")), frequency = 252)
Code
zoom_output <- ARIMA.c(0, 7, 0, 7, data = zoom_ts)
zoom_output[which.min(zoom_output$AIC), ]
        p d q       AIC       BIC      AICc
metrics 0 0 0 -5147.438 -5137.036 -5147.429
Code
#auto.arima(zoom_ts)
Code
zoom_arima <- Arima(zoom_ts, order = c(0,1,0))
zoom_arima_res <- residuals(zoom_arima)
summary(zoom_arima)
Series: zoom_ts 
ARIMA(0,1,0) 

sigma^2 = 0.002527:  log likelihood = 2105.59
AIC=-4209.18   AICc=-4209.17   BIC=-4203.98

Training set error measures:
                      ME       RMSE        MAE MPE MAPE      MASE       ACF1
Training set 1.56105e-05 0.05025495 0.03416894 NaN  Inf 0.9690134 -0.4810518
Code
zoom_res_acf <- ggAcf(zoom_arima_res, lag.max = 30)
zoom_res_pacf <- ggPacf(zoom_arima_res, lag.max = 30)
zoom_res_abs_acf <- ggAcf(abs(zoom_arima_res), lag.max = 30)
zoom_res_sq_acf <- ggAcf((zoom_arima_res)^2, lag.max = 30)
grid.arrange(zoom_res_acf, zoom_res_pacf, zoom_res_abs_acf, zoom_res_sq_acf, nrow=4)

Code
zoom_arima_arch_models <- list()
cc <- 1

for (p in 1:7) {
  for (q in 1:7) {
    zoom_arima_arch_models[[cc]] <- garch(zoom_arima_res, order = c(q, p), trace = FALSE)
    cc <- cc + 1
  }
}

zoom_ARIMA_ARCH_AIC <- sapply(zoom_arima_arch_models, AIC)
zoom_best_arima_index <- which.min(zoom_ARIMA_ARCH_AIC)
print(zoom_arima_arch_models[[zoom_best_arima_index]])

Call:
garch(x = zoom_arima_res, order = c(q, p), trace = FALSE)

Coefficient(s):
       a0         a1         b1         b2  
0.0005101  0.6296091  0.1153870  0.1435391  
Code
summary(zoom_arima_arch_models[[zoom_best_arima_index]])

Call:
garch(x = zoom_arima_res, order = c(q, p), trace = FALSE)

Model:
GARCH(2,1)

Residuals:
     Min       1Q   Median       3Q      Max 
-6.02883 -0.56911  0.02483  0.57428  5.26561 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 5.101e-04   4.451e-05   11.461  < 2e-16 ***
a1 6.296e-01   4.574e-02   13.766  < 2e-16 ***
b1 1.154e-01   3.757e-02    3.071  0.00213 ** 
b2 1.435e-01   3.309e-02    4.337 1.44e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 458.98, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 3.317, df = 1, p-value = 0.06857
Code
zoom_best_ARIMA_ARCH_fit <- fGarch::garchFit(
  formula = ~ garch(4, 1), 
  data = zoom_arima_res, 
  trace = FALSE, 
  cond.dist = "norm"
)

summary(zoom_best_ARIMA_ARCH_fit)

Title:
 GARCH Modelling 

Call:
 fGarch::garchFit(formula = ~garch(4, 1), data = zoom_arima_res, 
    cond.dist = "norm", trace = FALSE) 

Mean and Variance Equation:
 data ~ garch(4, 1)
<environment: 0x7f8a4790fac0>
 [data = zoom_arima_res]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1       alpha2       alpha3       alpha4  
-0.00015610   0.00063477   0.64901264   0.00000001   0.07379943   0.01258282  
      beta1  
 0.12488338  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -1.561e-04   7.995e-04   -0.195    0.845    
omega   6.348e-04   6.928e-04    0.916    0.360    
alpha1  6.490e-01   7.546e-02    8.600   <2e-16 ***
alpha2  1.000e-08   5.858e-01    0.000    1.000    
alpha3  7.380e-02   9.666e-02    0.764    0.445    
alpha4  1.258e-02   6.833e-02    0.184    0.854    
beta1   1.249e-01   9.108e-01    0.137    0.891    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 2340.65    normalized:  1.745451 

Description:
 Tue May  6 21:47:20 2025 by user:  


Standardised Residuals Tests:
                                  Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  435.9599472 0.000000e+00
 Shapiro-Wilk Test  R    W        0.9751553 1.943836e-14
 Ljung-Box Test     R    Q(10)  188.2911403 0.000000e+00
 Ljung-Box Test     R    Q(15)  206.1795330 0.000000e+00
 Ljung-Box Test     R    Q(20)  212.9919943 0.000000e+00
 Ljung-Box Test     R^2  Q(10)   11.1209602 3.481721e-01
 Ljung-Box Test     R^2  Q(15)   19.3798599 1.970336e-01
 Ljung-Box Test     R^2  Q(20)   26.8412586 1.397894e-01
 LM Arch Test       R    TR^2    16.8227267 1.563883e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-3.480462 -3.453312 -3.480517 -3.470291 
Code
exxon_acf <- ggAcf(exxon_returns_df$Delt.1.arithmetic)
exxon_pacf <- ggPacf(exxon_returns_df$Delt.1.arithmetic)
exxon_abs_acf <- ggAcf(abs(exxon_returns_df$Delt.1.arithmetic))
exxon_sq_acf <- ggAcf((exxon_returns_df$Delt.1.arithmetic)^2)
grid.arrange(exxon_acf, exxon_pacf, exxon_abs_acf, exxon_sq_acf, nrow=4)

Code
exxon_arch_models <- list()
cc <- 1

for (p in 1:7) {
  for (q in 1:7) {
    exxon_arch_models[[cc]] <- garch(exxon_returns_df$Delt.1.arithmetic, order = c(q, p), trace = FALSE)
    cc <- cc + 1
  }
}

exxon_ARCH_AIC <- sapply(exxon_arch_models, AIC)
exxon_best_index <- which.min(exxon_ARCH_AIC)
print(exxon_arch_models[[exxon_best_index]])

Call:
garch(x = exxon_returns_df$Delt.1.arithmetic, order = c(q, p),     trace = FALSE)

Coefficient(s):
       a0         a1         b1  
4.374e-06  6.351e-02  9.276e-01  
Code
summary(exxon_arch_models[[exxon_best_index]])

Call:
garch(x = exxon_returns_df$Delt.1.arithmetic, order = c(q, p),     trace = FALSE)

Model:
GARCH(1,1)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.85609 -0.57861  0.01957  0.64062  5.43256 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 4.374e-06   1.719e-06    2.544    0.011 *  
a1 6.351e-02   8.217e-03    7.729 1.09e-14 ***
b1 9.276e-01   1.026e-02   90.446  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 91.429, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 0.062978, df = 1, p-value = 0.8018
Code
exxon_best_ARCH_fit <- fGarch::garchFit(
  formula = ~ garch(1, 1), 
  data = exxon_returns_df$Delt.1.arithmetic, 
  trace = FALSE, 
  cond.dist = "norm"
)

summary(exxon_best_ARCH_fit)

Title:
 GARCH Modelling 

Call:
 fGarch::garchFit(formula = ~garch(1, 1), data = exxon_returns_df$Delt.1.arithmetic, 
    cond.dist = "norm", trace = FALSE) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x7f8a448ece80>
 [data = exxon_returns_df$Delt.1.arithmetic]

Conditional Distribution:
 norm 

Coefficient(s):
        mu       omega      alpha1       beta1  
6.1291e-04  4.3872e-06  6.3160e-02  9.2778e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu     6.129e-04   4.674e-04    1.311   0.1897    
omega  4.387e-06   1.994e-06    2.200   0.0278 *  
alpha1 6.316e-02   1.109e-02    5.693 1.25e-08 ***
beta1  9.278e-01   1.299e-02   71.421  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 3410.371    normalized:  2.543155 

Description:
 Tue May  6 21:47:20 2025 by user:  


Standardised Residuals Tests:
                                 Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  91.5452580 0.000000e+00
 Shapiro-Wilk Test  R    W       0.9903529 1.026791e-07
 Ljung-Box Test     R    Q(10)   5.1918937 8.779961e-01
 Ljung-Box Test     R    Q(15)   8.4761832 9.032777e-01
 Ljung-Box Test     R    Q(20)  16.4593835 6.877472e-01
 Ljung-Box Test     R^2  Q(10)  13.0228510 2.223970e-01
 Ljung-Box Test     R^2  Q(15)  15.2127824 4.362015e-01
 Ljung-Box Test     R^2  Q(20)  18.2678481 5.697688e-01
 LM Arch Test       R    TR^2   12.9114665 3.755084e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-5.080345 -5.064831 -5.080363 -5.074533 
Code
exxon_returns <- na.omit(exxon_returns)
exxon_ts <- ts(exxon_returns, start=decimal_date(as.Date("2020-01-01")), frequency = 252)
Code
exxon_output <- ARIMA.c(0, 7, 0, 7, data = exxon_ts)
exxon_output[which.min(exxon_output$AIC), ]
           p d q       AIC       BIC      AICc
metrics.59 4 0 2 -6499.225 -6457.615 -6499.117
Code
auto.arima(exxon_ts)
Series: exxon_ts 
ARIMA(0,0,0) with zero mean 

sigma^2 = 0.0004625:  log likelihood = 3245.94
AIC=-6489.87   AICc=-6489.87   BIC=-6484.67
Code
exxon_arima <- Arima(exxon_ts, order = c(4,0,2))
exxon_arima_res <- residuals(exxon_arima)
summary(exxon_arima)
Series: exxon_ts 
ARIMA(4,0,2) with non-zero mean 

Coefficients:
         ar1      ar2      ar3     ar4      ma1     ma2   mean
      0.8285  -0.8907  -0.0730  0.0538  -0.8632  0.9438  7e-04
s.e.  0.0401   0.0525   0.0368  0.0293   0.0301  0.0406  6e-04

sigma^2 = 0.0004568:  log likelihood = 3257.61
AIC=-6499.22   AICc=-6499.12   BIC=-6457.62

Training set error measures:
                       ME       RMSE        MAE MPE MAPE      MASE        ACF1
Training set 7.827952e-06 0.02131613 0.01537068 NaN  Inf 0.7349775 0.001259489
Code
exxon_res_acf <- ggAcf(exxon_arima_res, lag.max = 30)
exxon_res_pacf <- ggPacf(exxon_arima_res, lag.max = 30)
exxon_res_abs_acf <- ggAcf(abs(exxon_arima_res), lag.max = 30)
exxon_res_sq_acf <- ggAcf((exxon_arima_res)^2, lag.max = 30)
grid.arrange(exxon_res_acf, exxon_res_pacf, exxon_res_abs_acf, exxon_res_sq_acf, nrow=4)

Code
exxon_arima_arch_models <- list()
cc <- 1

for (p in 1:7) {
  for (q in 1:7) {
    exxon_arima_arch_models[[cc]] <- garch(exxon_arima_res, order = c(q, p), trace = FALSE)
    cc <- cc + 1
  }
}

exxon_ARIMA_ARCH_AIC <- sapply(exxon_arima_arch_models, AIC)
exxon_best_arima_index <- which.min(exxon_ARIMA_ARCH_AIC)
print(exxon_arima_arch_models[[exxon_best_arima_index]])

Call:
garch(x = exxon_arima_res, order = c(q, p), trace = FALSE)

Coefficient(s):
       a0         a1         b1  
4.653e-06  6.505e-02  9.252e-01  
Code
summary(exxon_arima_arch_models[[exxon_best_arima_index]])

Call:
garch(x = exxon_arima_res, order = c(q, p), trace = FALSE)

Model:
GARCH(1,1)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.95487 -0.62976 -0.02761  0.60640  5.38824 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 4.653e-06   1.808e-06    2.574   0.0101 *  
a1 6.505e-02   8.577e-03    7.584 3.35e-14 ***
b1 9.252e-01   1.079e-02   85.739  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 82.453, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 0.40896, df = 1, p-value = 0.5225
Code
exxon_best_ARIMA_ARCH_fit <- fGarch::garchFit(
  formula = ~ garch(1, 1), 
  data = exxon_arima_res, 
  trace = FALSE, 
  cond.dist = "norm"
)

summary(exxon_best_ARIMA_ARCH_fit)

Title:
 GARCH Modelling 

Call:
 fGarch::garchFit(formula = ~garch(1, 1), data = exxon_arima_res, 
    cond.dist = "norm", trace = FALSE) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x7f8a45854c50>
 [data = exxon_arima_res]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1        beta1  
-3.0363e-05   4.6769e-06   6.5068e-02   9.2514e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -3.036e-05   4.675e-04   -0.065   0.9482    
omega   4.677e-06   2.185e-06    2.141   0.0323 *  
alpha1  6.507e-02   1.214e-02    5.361 8.27e-08 ***
beta1   9.251e-01   1.449e-02   63.846  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 3414.361    normalized:  2.54613 

Description:
 Tue May  6 21:47:55 2025 by user:  


Standardised Residuals Tests:
                                 Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  82.8896071 0.000000e+00
 Shapiro-Wilk Test  R    W       0.9911075 3.013443e-07
 Ljung-Box Test     R    Q(10)   6.4650500 7.747968e-01
 Ljung-Box Test     R    Q(15)   8.4882056 9.027236e-01
 Ljung-Box Test     R    Q(20)  11.6279458 9.282868e-01
 Ljung-Box Test     R^2  Q(10)  11.1969052 3.423845e-01
 Ljung-Box Test     R^2  Q(15)  14.3841572 4.966158e-01
 Ljung-Box Test     R^2  Q(20)  18.3572348 5.638882e-01
 LM Arch Test       R    TR^2   11.2019205 5.116980e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-5.086295 -5.070780 -5.086312 -5.080483 

Model Equations

The model chosen above is an ARIMA+GARCH model that uses ARIMA(2,0,3) and GARCH(1,1). Therefore, the mathematical representation of this is as follows:

ARIMA: \[\phi(B)(x_t - 0.0013) = \theta(B)w_t\] where \[\phi(B) = 1 - 1.4666B + 0.9392B^2\] and \[\theta(B) = 1 - 1.4919B + 1.0157B^2 - 0.0352B^3\]

GARCH: \[\sigma^2_t = (1.3141 \times 10^{-04}) + (1.4871 \times 10^{-1})w^2_{t-1} + (7.2877 \times 10^{-1})\sigma^2_{t-1}\]

Note: method of getting model equation is from Gamage, P. (2026). Applied Time Series for Data Science.

The model chosen above is an ARIMA+GARCH model that uses ARIMA(0,1,0) and GARCH(4,1). Therefore, the mathematical representation of this is as follows:

ARIMA: \[\phi(B)(1-B)x_t = \theta(B)w_t\] where \[\phi(B) = 1\] and \[\theta(B) = 1\]

GARCH: \[\sigma^2_t = (1.603 \times 10^{-04}) + (3.806 \times 10^{-1})w^2_{t-1} + (8.381 \times 10^{-2})\sigma^2_{t-1} + (1.928 \times 10^{-7})\sigma^2_{t-2} + (1.930 \times 10^{-1})\sigma^2_{t-3} + (2.949 \times 10^{-1})\sigma^2_{t-4}\]


The model chosen above is an ARIMA+GARCH model that uses ARIMA(4,0,2) and GARCH(1,1). Using the coefficients from the fitted model output above, the mathematical representation of this is as follows:

ARIMA: \[\phi(B)(x_t - 0.0007) = \theta(B)w_t\] where \[\phi(B) = 1 - 0.8285B + 0.8907B^2 + 0.0730B^3 - 0.0538B^4\] and \[\theta(B) = 1 - 0.8632B + 0.9438B^2\]

GARCH: \[\sigma^2_t = (4.677 \times 10^{-06}) + (6.507 \times 10^{-2})w^2_{t-1} + (9.251 \times 10^{-1})\sigma^2_{t-1}\]

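As a sanity check on the GARCH(1,1) variance equation, the conditional-variance recursion can be reproduced directly in base R. The coefficients below are taken from the `garchFit` output above; the shock series is simulated purely for illustration:

```r
# GARCH(1,1) conditional-variance recursion:
#   sigma2[t] = omega + alpha1 * w[t-1]^2 + beta1 * sigma2[t-1]
omega  <- 4.677e-06
alpha1 <- 6.507e-02
beta1  <- 9.251e-01

garch11_sigma2 <- function(w, omega, alpha1, beta1) {
  sigma2 <- numeric(length(w))
  sigma2[1] <- omega / (1 - alpha1 - beta1)  # start at the unconditional variance
  for (t in 2:length(w)) {
    sigma2[t] <- omega + alpha1 * w[t - 1]^2 + beta1 * sigma2[t - 1]
  }
  sigma2
}

set.seed(1)
w    <- rnorm(250, sd = 0.02)  # illustrative shocks on a daily-return scale
sig2 <- garch11_sigma2(w, omega, alpha1, beta1)
```

Note that the persistence $\alpha_1 + \beta_1 \approx 0.990$ is close to one, so volatility shocks in this fitted model decay very slowly, which is typical of daily equity returns.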